{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Creating and Evaluating Solutions <a class=\"anchor\" id=\"top\"></a>\n",
    "\n",
    "In this notebook, you will train several models using Amazon Personalize.\n",
    "\n",
    "1. [Introduction](#intro)\n",
    "1. [Create solutions](#solutions)\n",
    "1. [Evaluate solutions](#eval)\n",
    "1. [Using evaluation metrics](#use)\n",
    "1. [Storing useful variables](#vars)\n",
    "\n",
    "## Introduction <a class=\"anchor\" id=\"intro\"></a>\n",
    "\n",
    "To recap, for the most part, the algorithms in Amazon Personalize (called recipes) look to solve different tasks, explained here:\n",
    "\n",
    "1. **HRNN & HRNN-Metadata** - Recommends items based on previous user interactions with items.\n",
    "1. **HRNN-Coldstart** - Recommends new items for which interaction data is not yet available.\n",
    "1. **Personalized-Ranking** - Takes a collection of items and then orders them in probable order of interest using an HRNN-like approach.\n",
    "1. **SIMS (Similar Items)** - Given one item, recommends other items also interacted with by users.\n",
    "1. **Popularity-Count** - Recommends the most popular items, if HRNN or HRNN-Metadata do not have an answer - this is returned by default.\n",
    "\n",
    "No matter the use case, the algorithms all share a base of learning on user-item-interaction data which is defined by 3 core attributes:\n",
    "\n",
    "1. **UserID** - The user who interacted\n",
    "1. **ItemID** - The item the user interacted with\n",
    "1. **Timestamp** - The time at which the interaction occurred\n",
    "\n",
    "We also support event types and event values defined by:\n",
    "\n",
    "1. **Event Type** - Categorical label of an event (browse, purchased, rated, etc).\n",
    "1. **Event Value** - A value corresponding to the event type that occurred. Generally speaking, we look for normalized values between 0 and 1 over the event types. For example, if there are three phases to complete a transaction (clicked, added-to-cart, and purchased), then there would be an event_value for each phase as 0.33, 0.66, and 1.0 respectfully.\n",
    "\n",
    "The event type and event value fields are additional data which can be used to filter the data sent for training the personalization model. In this particular exercise we will not have an event type or event value. \n",
    "\n",
    "To run this notebook, you need to have run the previous notebook, `01_Validating_and_Importing_User_Item_Interaction_Data`, where you created a dataset and imported interaction data into Amazon Personalize. At the end of that notebook, you saved some of the variable values, which you now need to load into this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create solutions <a class=\"anchor\" id=\"solutions\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "In this notebook, you will create solutions with the following recipes:\n",
    "\n",
    "1. HRNN\n",
    "1. SIMS\n",
    "1. Personalized-Ranking\n",
    "\n",
    "Since you have not imported any metadata, and there are no coldstart items in this dataset, this notebook will skip the HRNN-Metadata and HRNN-Coldstart recipes. The Popularity-Count recipe is the simplest solution available in Amazon Personalize and it should only be used as a fallback, so it will also not be covered in this notebook.\n",
    "\n",
    "Similar to the previous notebook, start by importing the relevant packages, and set up a connection to Amazon Personalize using the SDK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "from time import sleep\n",
    "import json\n",
    "\n",
    "import boto3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure the SDK to Personalize:\n",
    "personalize = boto3.client('personalize')\n",
    "personalize_runtime = boto3.client('personalize-runtime')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In Amazon Personalize, a specific variation of an algorithm is called a recipe. Different recipes are suitable for different situations. A trained model is called a solution, and each solution can have many versions that relate to a given volume of data when the model was trained.\n",
    "\n",
    "To start, we will list all the recipes that are supported. This will allow you to select one and use that to build your model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'recipes': [{'name': 'aws-hrnn',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())},\n",
       "  {'name': 'aws-hrnn-coldstart',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-coldstart',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())},\n",
       "  {'name': 'aws-hrnn-metadata',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-hrnn-metadata',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())},\n",
       "  {'name': 'aws-personalized-ranking',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-personalized-ranking',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())},\n",
       "  {'name': 'aws-popularity-count',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-popularity-count',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())},\n",
       "  {'name': 'aws-sims',\n",
       "   'recipeArn': 'arn:aws:personalize:::recipe/aws-sims',\n",
       "   'status': 'ACTIVE',\n",
       "   'creationDateTime': datetime.datetime(2018, 11, 26, 0, 0, tzinfo=tzlocal()),\n",
       "   'lastUpdatedDateTime': datetime.datetime(1970, 1, 1, 0, 0, tzinfo=tzlocal())}],\n",
       " 'ResponseMetadata': {'RequestId': 'd0e8866a-1dc7-48af-b29a-46b67a140273',\n",
       "  'HTTPStatusCode': 200,\n",
       "  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',\n",
       "   'date': 'Thu, 09 Apr 2020 04:09:56 GMT',\n",
       "   'x-amzn-requestid': 'd0e8866a-1dc7-48af-b29a-46b67a140273',\n",
       "   'content-length': '989',\n",
       "   'connection': 'keep-alive'},\n",
       "  'RetryAttempts': 0}}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "personalize.list_recipes()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output is just a JSON representation of all of the algorithms mentioned in the introduction.\n",
    "\n",
    "Next we will select specific recipes and build models with them.\n",
    "\n",
    "### HRNN\n",
    "\n",
    "HRNN (hierarchical recurrent neural network) is one of the more advanced recommendation models that you can use and it allows for real-time updates of recommendations based on user behavior. It also tends to outperform other approaches, like collaborative filtering. This recipe takes the longest to train, so let's start with this recipe first.\n",
    "\n",
    "For our use case, using the LastFM data, we can use HRNN to recommend new artists to a user based on the user's previous artist tagging behavior. Remember, we used the tagging data to represent positive interactions between a user and an artist.\n",
    "\n",
    "First, select the recipe by finding the ARN in the list of recipes above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "HRNN_recipe_arn = \"arn:aws:personalize:::recipe/aws-hrnn\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution\n",
    "\n",
    "First you create a solution using the recipe. Although you provide the dataset ARN in this step, the model is not yet trained. See this as an identifier instead of a trained model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-hrnn\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"bf01d5cb-4cf2-45b0-ab2a-812d249d52fe\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"bf01d5cb-4cf2-45b0-ab2a-812d249d52fe\",\n",
      "      \"content-length\": \"95\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "hrnn_create_solution_response = personalize.create_solution(\n",
    "    name = \"personalize-poc-hrnn\",\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    recipeArn = HRNN_recipe_arn\n",
    ")\n",
    "\n",
    "hrnn_solution_arn = hrnn_create_solution_response['solutionArn']\n",
    "print(json.dumps(hrnn_create_solution_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution version\n",
    "\n",
    "Once you have a solution, you need to create a version in order to complete the model training. The training can take a while to complete, upwards of 25 minutes, and an average of 40 minutes for this recipe with our dataset. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create many models and deploy them quickly. So we will set up the while loop for all of the solutions further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "hrnn_create_solution_version_response = personalize.create_solution_version(\n",
    "    solutionArn = hrnn_solution_arn\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-hrnn/f1383f2f\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"d25f261d-6fa5-4faa-9210-fa43474382d9\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"d25f261d-6fa5-4faa-9210-fa43474382d9\",\n",
      "      \"content-length\": \"111\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "hrnn_solution_version_arn = hrnn_create_solution_version_response['solutionVersionArn']\n",
    "print(json.dumps(hrnn_create_solution_version_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SIMS\n",
    "\n",
    "\n",
    "SIMS is one of the oldest algorithms used within Amazon for recommendation systems. A core use case for it is when you have one item and you want to recommend items that have been interacted with in similar ways over your entire user base. This means the result is not personalized per user. Sometimes this leads to recommending mostly popular items, so there is a hyperparameter that can be tweaked which will reduce the popular items in your results. \n",
    "\n",
    "For our use case, using the LastFM data, let's assume we pick a particular artist. We can then use SIMS to recommend other artists based on the tagging behavior of the entire user base. The results are not personalized per user, but instead, differ depending on the original artist we chose as our input.\n",
    "\n",
    "Just like last time, we start by selecting the recipe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "SIMS_recipe_arn = \"arn:aws:personalize:::recipe/aws-sims\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution\n",
    "\n",
    "As with HRNN, start by creating the solution first. Although you provide the dataset ARN in this step, the model is not yet trained. See this as an identifier instead of a trained model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-sims\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"bb3fbc5f-2b61-40d1-9fb9-0b48170d4ee4\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"bb3fbc5f-2b61-40d1-9fb9-0b48170d4ee4\",\n",
      "      \"content-length\": \"95\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "sims_create_solution_response = personalize.create_solution(\n",
    "    name = \"personalize-poc-sims\",\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    recipeArn = SIMS_recipe_arn\n",
    ")\n",
    "\n",
    "sims_solution_arn = sims_create_solution_response['solutionArn']\n",
    "print(json.dumps(sims_create_solution_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution version\n",
    "\n",
    "Once you have a solution, you need to create a version in order to complete the model training. The training can take a while to complete, upwards of 25 minutes, and an average of 35 minutes for this recipe with our dataset. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create many models and deploy them quickly. So we will set up the while loop for all of the solutions further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "sims_create_solution_version_response = personalize.create_solution_version(\n",
    "    solutionArn = sims_solution_arn\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-sims/c9995c55\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"986d1477-955d-4d05-a0ad-514741ce7605\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"986d1477-955d-4d05-a0ad-514741ce7605\",\n",
      "      \"content-length\": \"111\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "sims_solution_version_arn = sims_create_solution_version_response['solutionVersionArn']\n",
    "print(json.dumps(sims_create_solution_version_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Personalized Ranking\n",
    "\n",
    "Personalized Ranking is an interesting application of HRNN. Instead of just recommending what is most probable for the user in question, this algorithm takes in a user and a list of items as well. The items are then rendered back in the order of most probable relevance for the user. The use case here is for filtering on genre for example, or when you have a broad collection that you would like better ordered for a particular user.\n",
    "\n",
    "For our use case, using the LastFM data, we could imagine that a particular record label is paying us to recommend their artists to our users in a special promotion. Therefore, we know the list of artists we want to recommend, but we want to find out which of these artists each user will like most. We would use personalized ranking to re-order the list of artists for each user, based on their previous tagging history. \n",
    "\n",
    "Just like last time, we start by selecting the recipe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "rerank_recipe_arn = \"arn:aws:personalize:::recipe/aws-personalized-ranking\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution\n",
    "\n",
    "As with the previous solution, start by creating the solution first. Although you provide the dataset ARN in this step, the model is not yet trained. See this as an identifier instead of a trained model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-rerank\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"83e117a9-b210-4496-9b9e-01acd389b073\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:57 GMT\",\n",
      "      \"x-amzn-requestid\": \"83e117a9-b210-4496-9b9e-01acd389b073\",\n",
      "      \"content-length\": \"97\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "rerank_create_solution_response = personalize.create_solution(\n",
    "    name = \"personalize-poc-rerank\",\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    recipeArn = rerank_recipe_arn\n",
    ")\n",
    "\n",
    "rerank_solution_arn = rerank_create_solution_response['solutionArn']\n",
    "print(json.dumps(rerank_create_solution_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create the solution version\n",
    "\n",
    "Once you have a solution, you need to create a version in order to complete the model training. The training can take a while to complete, upwards of 25 minutes, and an average of 35 minutes for this recipe with our dataset. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create many models and deploy them quickly. So we will set up the while loop for all of the solutions further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "rerank_create_solution_version_response = personalize.create_solution_version(\n",
    "    solutionArn = rerank_solution_arn\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-rerank/134bd652\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"0401c0c2-5b50-4f24-bcce-85ca5d6d8ec4\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:09:57 GMT\",\n",
      "      \"x-amzn-requestid\": \"0401c0c2-5b50-4f24-bcce-85ca5d6d8ec4\",\n",
      "      \"content-length\": \"113\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "rerank_solution_version_arn = rerank_create_solution_version_response['solutionVersionArn']\n",
    "print(json.dumps(rerank_create_solution_version_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### View solution creation status\n",
    "\n",
    "As promised, how to view the status updates in the console:\n",
    "\n",
    "* In another browser tab you should already have the AWS Console up from opening this notebook instance. \n",
    "* Switch to that tab and search at the top for the service `Personalize`, then go to that service page. \n",
    "* Click `View dataset groups`.\n",
    "* Click the name of your dataset group, most likely something with POC in the name.\n",
    "* Click `Solutions and recipes`.\n",
    "* You will now see a list of all of the solutions you created above,  including a column with the status of the solution versions. Once it is `Active`, your solution is ready to be reviewed. It is also capable of being deployed.\n",
    "\n",
    "Or simply run the cell below to keep track of the solution version creation status."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "Build succeeded for arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-sims/c9995c55\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "At least one solution build is still in progress\n",
      "Build succeeded for arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-hrnn/f1383f2f\n",
      "At least one solution build is still in progress\n",
      "Build succeeded for arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-rerank/134bd652\n"
     ]
    }
   ],
   "source": [
    "in_progress_solution_versions = [\n",
    "    hrnn_solution_version_arn,\n",
    "    sims_solution_version_arn,\n",
    "    rerank_solution_version_arn\n",
    "]\n",
    "\n",
    "max_time = time.time() + 3*60*60 # 3 hours\n",
    "while time.time() < max_time:\n",
    "    for solution_version_arn in in_progress_solution_versions:\n",
    "        version_response = personalize.describe_solution_version(\n",
    "            solutionVersionArn = solution_version_arn\n",
    "        )\n",
    "        status = version_response[\"solutionVersion\"][\"status\"]\n",
    "        \n",
    "        if status == \"ACTIVE\":\n",
    "            print(\"Build succeeded for {}\".format(solution_version_arn))\n",
    "            in_progress_solution_versions.remove(solution_version_arn)\n",
    "        elif status == \"CREATE FAILED\":\n",
    "            print(\"Build failed for {}\".format(solution_version_arn))\n",
    "            in_progress_solution_versions.remove(solution_version_arn)\n",
    "    \n",
    "    if len(in_progress_solution_versions) <= 0:\n",
    "        break\n",
    "    else:\n",
    "        print(\"At least one solution build is still in progress\")\n",
    "        \n",
    "    time.sleep(60)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Hyperparameter tuning\n",
    "\n",
    "Personalize offers the option of running hyperparameter tuning when creating a solution. Because of the additional computation required to perform hyperparameter tuning, this feature is turned off by default. Therefore, the solutions we created above, will simply use the default values of the hyperparameters for each recipe. For more information about hyperparameter tuning, see the [documentation](https://docs.aws.amazon.com/personalize/latest/dg/customizing-solution-config-hpo.html).\n",
    "\n",
    "If you have settled on the correct recipe to use, and are ready to run hyperparameter tuning, the following code shows how you would do so, using SIMS as an example.\n",
    "\n",
    "```python\n",
    "sims_create_solution_response = personalize.create_solution(\n",
    "    name = \"personalize-poc-sims-hpo\",\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    recipeArn = SIMS_recipe_arn,\n",
    "    performHPO=True\n",
    ")\n",
    "\n",
    "sims_solution_arn = sims_create_solution_response['solutionArn']\n",
    "print(json.dumps(sims_create_solution_response, indent=2))\n",
    "```\n",
    "\n",
    "If you already know the values you want to use for a specific hyperparameter, you can also set this value when you create the solution. The code below shows how you could set the value for the `popularity_discount_factor` for the SIMS recipe.\n",
    "\n",
    "```python\n",
    "sims_create_solution_response = personalize.create_solution(\n",
    "    name = \"personalize-poc-sims-set-hp\",\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    recipeArn = SIMS_recipe_arn,\n",
    "    solutionConfig = {\n",
    "        'algorithmHyperParameters': {\n",
    "            'popularity_discount_factor': '0.7'\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "sims_solution_arn = sims_create_solution_response['solutionArn']\n",
    "print(json.dumps(sims_create_solution_response, indent=2))\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluate solution versions <a class=\"anchor\" id=\"eval\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "It should not take more than an hour to train all the solutions from this notebook. While training is in progress, we recommend taking the time to read up on the various algorithms (recipes) and their behavior in detail. This is also a good time to consider alternatives to how the data was fed into the system and what kind of results you expect to see.\n",
    "\n",
    "When the solutions finish creating, the next step is to obtain the evaluation metrics. Personalize calculates these metrics based on a subset of the training data. The image below illustrates how Personalize splits the data. Given 10 users, with 10 interactions each (a circle represents an interaction), the interactions are ordered from oldest to newest based on the timestamp. Personalize uses all of the interaction data from 90% of the users (blue circles) to train the solution version, and the remaining 10% for evaluation. For each of the users in the remaining 10%, 90% of their interaction data (green circles) is used as input for the call to the trained model. The remaining 10% of their data (orange circle) is compared to the output produced by the model and used to calculate the evaluation metrics.\n",
    "\n",
    "![personalize metrics](../static/imgs/personalize_metrics.png)\n",
    "\n",
    "We recommend reading [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html) to understand the metrics, but we have also copied parts of the documentation below for convenience.\n",
    "\n",
    "You need to understand the following terms regarding evaluation in Personalize:\n",
    "\n",
    "* *Relevant recommendation* refers to a recommendation that matches a value in the testing data for the particular user.\n",
    "* *Rank* refers to the position of a recommended item in the list of recommendations. Position 1 (the top of the list) is presumed to be the most relevant to the user.\n",
    "* *Query* refers to the internal equivalent of a GetRecommendations call.\n",
    "\n",
    "The metrics produced by Personalize are:\n",
    "* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).\n",
    "* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.\n",
    "* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 - 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.\n",
    "* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.\n",
    "\n",
    "Let's take a look at the evaluation metrics for each of the solutions produced in this notebook. *Please note, your results might differ from the results described in the text of this notebook, due to the quality of the LastFM dataset.* \n",
    "\n",
    "### HRNN metrics\n",
    "\n",
    "First, retrieve the evaluation metrics for the HRNN solution version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-hrnn/f1383f2f\",\n",
      "  \"metrics\": {\n",
      "    \"coverage\": 0.0438,\n",
      "    \"mean_reciprocal_rank_at_25\": 0.0502,\n",
      "    \"normalized_discounted_cumulative_gain_at_10\": 0.0825,\n",
      "    \"normalized_discounted_cumulative_gain_at_25\": 0.0911,\n",
      "    \"normalized_discounted_cumulative_gain_at_5\": 0.0597,\n",
      "    \"precision_at_10\": 0.016,\n",
      "    \"precision_at_25\": 0.0082,\n",
      "    \"precision_at_5\": 0.0188\n",
      "  },\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"f369d7ae-5e5c-451b-a3be-9dc8a4bf0f4a\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:46:01 GMT\",\n",
      "      \"x-amzn-requestid\": \"f369d7ae-5e5c-451b-a3be-9dc8a4bf0f4a\",\n",
      "      \"content-length\": \"408\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "hrnn_solution_metrics_response = personalize.get_solution_metrics(\n",
    "    solutionVersionArn = hrnn_solution_version_arn\n",
    ")\n",
    "\n",
    "print(json.dumps(hrnn_solution_metrics_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The normalized discounted cumulative gain above tells us that at 5 items, we have less than a 6% chance (5.9% literally) in a recommendation being a part of a user's interaction history (in the hold out phase from training and validation). Around 4% of the recommended items are unique, and we have a precision of only 1% in the top 5 recommended items. \n",
    "\n",
    "This is clearly not a great model, but keep in mind that we had to use tagging data for our interactions because the listening data did not have timestamps associated with it. These results indicate that the tagging data may not be as relevant as we hoped. But, let's take a look at the results of the other solutions first.\n",
    "\n",
    "### SIMS metrics\n",
    "\n",
    "Now, retrieve the evaluation metrics for the SIMS solution version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-sims/c9995c55\",\n",
      "  \"metrics\": {\n",
      "    \"coverage\": 0.0856,\n",
      "    \"mean_reciprocal_rank_at_25\": 0.0266,\n",
      "    \"normalized_discounted_cumulative_gain_at_10\": 0.0302,\n",
      "    \"normalized_discounted_cumulative_gain_at_25\": 0.0402,\n",
      "    \"normalized_discounted_cumulative_gain_at_5\": 0.0213,\n",
      "    \"precision_at_10\": 0.0104,\n",
      "    \"precision_at_25\": 0.0065,\n",
      "    \"precision_at_5\": 0.0104\n",
      "  },\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"b4c5e298-a4b6-4607-be33-0c1d259108cd\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:46:01 GMT\",\n",
      "      \"x-amzn-requestid\": \"b4c5e298-a4b6-4607-be33-0c1d259108cd\",\n",
      "      \"content-length\": \"409\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "sims_solution_metrics_response = personalize.get_solution_metrics(\n",
    "    solutionVersionArn = sims_solution_version_arn\n",
    ")\n",
    "\n",
    "print(json.dumps(sims_solution_metrics_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example we are seeing a slightly lower precision at 5 items, a little over 1% at 1.04% this time. Effectively this is probably within the margin of error, but given that no effort was made to mask popularity, it may just be returning super popular results that a large volume of users have interacted with in some way. \n",
    "\n",
    "### Personalized ranking metrics\n",
    "\n",
    "Now, retrieve the evaluation metrics for the personalized ranking solution version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"solutionVersionArn\": \"arn:aws:personalize:ap-southeast-2:002268754244:solution/personalize-poc-rerank/134bd652\",\n",
      "  \"metrics\": {\n",
      "    \"coverage\": 0.0021,\n",
      "    \"mean_reciprocal_rank_at_25\": 0.0249,\n",
      "    \"normalized_discounted_cumulative_gain_at_10\": 0.0399,\n",
      "    \"normalized_discounted_cumulative_gain_at_25\": 0.0524,\n",
      "    \"normalized_discounted_cumulative_gain_at_5\": 0.0268,\n",
      "    \"precision_at_10\": 0.0074,\n",
      "    \"precision_at_25\": 0.0057,\n",
      "    \"precision_at_5\": 0.0069\n",
      "  },\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"d8b6cd6e-a2c4-452c-9cbb-1f209f884361\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Thu, 09 Apr 2020 04:46:01 GMT\",\n",
      "      \"x-amzn-requestid\": \"d8b6cd6e-a2c4-452c-9cbb-1f209f884361\",\n",
      "      \"content-length\": \"411\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "rerank_solution_metrics_response = personalize.get_solution_metrics(\n",
    "    solutionVersionArn = rerank_solution_version_arn\n",
    ")\n",
    "\n",
    "print(json.dumps(rerank_solution_metrics_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just a quick comment on this one, here we see a very low precision. As this is also based on HRNN, this result is unexpected, but it could just be caused by an unfortunate test set. \n",
    "\n",
    "## Using evaluation metrics <a class=\"anchor\" id=\"use\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "It is important to use evaluation metrics carefully. There are a number of factors to keep in mind.\n",
    "\n",
    "* If there is an existing recommendation system in place, this will have influenced the user's interaction history which you use to train your new solutions. This means the evaluation metrics are biased to favor the existing solution. If you work to push the evaluation metrics to match or exceed the existing solution, you may just be pushing the HRNN to behave like the existing solution and might not end up with something better.\n",
    "* The HRNN Coldstart recipe is difficult to evaluate using the metrics produced by Amazon Personalize. The aim of the recipe is to recommend items which are new to your business. Therefore, these items will not appear in the existing user transaction data which is used to compute the evaluation metrics. As a result, HRNN Coldstart will never appear to perform better than the other recipes, when compared on the evaluation metrics alone. \n",
    "\n",
    "Keeping in mind these factors, the evaluation metrics produced by Personalize are generally useful for two cases:\n",
    "1. Comparing the performance of solution versions trained on the same recipe, but with different values for the hyperparameters.\n",
    "1. Comparing the performance of solution versions trained on different recipes (except HRNN Coldstart).\n",
    "\n",
    "Properly evaluating a recommendation system is always best done through A/B testing while measuring actual business outcomes. Since recommendations generated by a system usually influence the user behavior which it is based on, it is better to run small experiments and apply A/B testing for longer periods of time. Over time, the bias from the existing model will fade."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Storing useful variables <a class=\"anchor\" id=\"vars\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "Before exiting this notebook, run the following cells to save the version ARNs for use in the next notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Stored 'hrnn_solution_version_arn' (str)\n",
      "Stored 'sims_solution_version_arn' (str)\n",
      "Stored 'rerank_solution_version_arn' (str)\n"
     ]
    }
   ],
   "source": [
    "%store hrnn_solution_version_arn\n",
    "%store sims_solution_version_arn\n",
    "%store rerank_solution_version_arn"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}